Testing in the Presence of Nuisance Parameters: 
Some Comments on Tests Post-Model-Selection 
and Random Critical Values 

Hannes Leeb and Benedikt M. Pötscher

Department of Statistics, University of Vienna

Preliminary version: May 25, 2012 
This version: September 20, 2012 

Abstract 

We point out that the ideas underlying some test procedures recently
proposed for testing post-model-selection (and for some other test
problems) in the econometrics literature have been around for quite some
time in the statistics literature. We also sharpen some of these results in
the statistics literature and show that some of the proposals in the
econometrics literature lead to tests that do not have the claimed size
properties.

1 Introduction 

Suppose we have a sequence of statistical experiments given by a family of
probability measures $\{P_{n,\alpha,\beta} : \alpha \in A, \beta \in B\}$
where $\alpha$ is a "parameter" of interest, and $\beta$ is a
"nuisance-parameter". Often, but not always, $A$ and $B$ will be subsets of
Euclidean space. Suppose the researcher wants to base a test for the
null-hypothesis $H_0 : \alpha = \alpha_0$ on the real-valued test-statistic
$T_n(\alpha_0)$, with large values of $T_n(\alpha_0)$ being taken as
indicative for violation of $H_0$.^1 Suppose further that the distribution of
$T_n(\alpha_0)$ under $H_0$ depends on the nuisance parameter $\beta$.
This leads to the key question: How should the critical value then be chosen?
[Of course, if another pivotal test-statistic is available, this one could be
used. However, we consider here the case where a pivotal test-statistic either
does not exist, or where the researcher - for better or worse - insists on
using $T_n(\alpha_0)$.] In this situation a standard way (see, e.g., Bickel
and Doksum (1977), p. 170) to deal with this problem is to choose as critical
value

$$c_{n,\sup}(\delta) = \sup_{\beta \in B} c_{n,\beta}(\delta), \qquad (1)$$

where $0 < \delta < 1$ and where $c_{n,\beta}(\delta)$ satisfies
$P_{n,\alpha_0,\beta}(T_n(\alpha_0) > c_{n,\beta}(\delta)) = \delta$, i.e.,
$c_{n,\beta}(\delta)$ is a $(1-\delta)$-quantile of the distribution of
$T_n(\alpha_0)$ under $P_{n,\alpha_0,\beta}$. [We assume here the existence of
such a $c_{n,\beta}(\delta)$, but we do not insist that it is chosen as the
smallest possible number satisfying the above condition, although this will
usually be the case.] While the resulting test which rejects $H_0$ for

$$T_n(\alpha_0) > c_{n,\sup}(\delta) \qquad (2)$$

certainly is a level $\delta$ test (i.e., has size $\le \delta$), the
conservatism caused by taking the supremum will often result in poor power
properties, especially for values of $\beta$ for which $c_{n,\beta}(\delta)$
is much smaller than $c_{n,\sup}(\delta)$. The test obtained from (1) and (2)
above (and an asymptotic variant thereof) is precisely what Andrews and
Guggenberger (2009) call a "size-corrected fixed critical value" test.

^1 This framework obviously allows for "one-sided" as well as for "two-sided"
alternatives (when these concepts make sense) by a proper definition of the
test statistic.

An obvious alternative idea, which is much less conservative, is to use
$c_{n,\hat\beta_n}(\delta)$ as a random critical value, where $\hat\beta_n$
is an estimator for $\beta$ (taking its values in $B$), and to reject $H_0$ if

$$T_n(\alpha_0) > c_{n,\hat\beta_n}(\delta) \qquad (3)$$

obtains (measurability of $c_{n,\hat\beta_n}(\delta)$ being assumed). This
choice of critical value can be viewed as a parametric bootstrap procedure.
However,

$$P_{n,\alpha_0,\beta}\left(T_n(\alpha_0) > c_{n,\hat\beta_n}(\delta)\right) \ge P_{n,\alpha_0,\beta}\left(T_n(\alpha_0) > c_{n,\sup}(\delta)\right)$$

clearly holds for every $\beta$, indicating that the test using the random
critical value $c_{n,\hat\beta_n}(\delta)$ may not be a level $\delta$ test,
but may have size larger than $\delta$. This was already noted by Loh (1985).
A precise result in this direction, which is a variation of Theorem 2.1 in
Loh (1985), is as follows.

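Proposition 1 Suppose that there exists a
$\beta_n^{\max} = \beta_n^{\max}(\delta)$ such that
$c_{n,\beta_n^{\max}}(\delta) = c_{n,\sup}(\delta)$. Then

$$P_{n,\alpha_0,\beta_n^{\max}}\left(c_{n,\hat\beta_n}(\delta) < T_n(\alpha_0) \le c_{n,\sup}(\delta)\right) > 0 \qquad (4)$$

implies

$$\sup_{\beta \in B} P_{n,\alpha_0,\beta}\left(T_n(\alpha_0) > c_{n,\hat\beta_n}(\delta)\right) > \delta, \qquad (5)$$

i.e., the test using the random critical value $c_{n,\hat\beta_n}(\delta)$
does not have level $\delta$. More generally, if $\hat c_n$ is any random
critical value satisfying
$\hat c_n \le c_{n,\beta_n^{\max}}(\delta)$ $(= c_{n,\sup}(\delta))$ with
$P_{n,\alpha_0,\beta_n^{\max}}$-probability 1, then (4) still implies (5) if
in both expressions $c_{n,\hat\beta_n}(\delta)$ is replaced by $\hat c_n$.
[The result also holds if the random critical values $\hat c_n$ also depend
on some additional randomization mechanism.]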

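Proof. Observe that $c_{n,\hat\beta_n}(\delta) \le c_{n,\sup}(\delta)$ always
holds. But then the l.h.s. of (5) is not less than

$$\begin{aligned}
P_{n,\alpha_0,\beta_n^{\max}}\left(T_n(\alpha_0) > c_{n,\hat\beta_n}(\delta)\right)
&= P_{n,\alpha_0,\beta_n^{\max}}\left(T_n(\alpha_0) > c_{n,\beta_n^{\max}}(\delta)\right)
 + P_{n,\alpha_0,\beta_n^{\max}}\left(c_{n,\hat\beta_n}(\delta) < T_n(\alpha_0) \le c_{n,\sup}(\delta)\right) \\
&= \delta + P_{n,\alpha_0,\beta_n^{\max}}\left(c_{n,\hat\beta_n}(\delta) < T_n(\alpha_0) \le c_{n,\sup}(\delta)\right) > \delta
\end{aligned}$$

in view of (4). The proof for the second claim is completely analogous. ■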

To better appreciate condition (4) consider the case where
$c_{n,\beta}(\delta)$ is uniquely maximized at $\beta_n^{\max}$ and
$P_{n,\alpha_0,\beta_n^{\max}}(\hat\beta_n \ne \beta_n^{\max})$ is positive.
Then
$P_{n,\alpha_0,\beta_n^{\max}}(c_{n,\hat\beta_n}(\delta) < c_{n,\sup}(\delta))$
is positive and therefore we can expect condition (4) to hold, unless there
exists a quite strange dependence structure between $\hat\beta_n$ and
$T_n(\alpha_0)$. The same argument applies in the more general situation
where $\beta_n^{\max}$ is not unique but
$P_{n,\alpha_0,\beta_n^{\max}}(\hat\beta_n \notin \arg\max c_{n,\beta}(\delta)) > 0$
holds.

In the same vein, it is also useful to note that Condition (4) can
equivalently be stated as follows: The conditional distribution
$P_{n,\alpha_0,\beta_n^{\max}}(T_n(\alpha_0) \le \cdot \mid \hat\beta_n)$ of
$T_n(\alpha_0)$ given $\hat\beta_n$ puts positive mass on the interval
$(c_{n,\hat\beta_n}(\delta), c_{n,\sup}(\delta)]$ for a set of $\hat\beta_n$
that has positive probability under $P_{n,\alpha_0,\beta_n^{\max}}$. [Also
note that Condition (4) implies that
$c_{n,\hat\beta_n}(\delta) < c_{n,\sup}(\delta)$ must hold with positive
$P_{n,\alpha_0,\beta_n^{\max}}$-probability.] A sufficient condition for this
then clearly is that for a set of $\hat\beta_n$'s of positive
$P_{n,\alpha_0,\beta_n^{\max}}$-probability we have that (i)
$c_{n,\hat\beta_n}(\delta) < c_{n,\sup}(\delta)$, and (ii) the conditional
distribution
$P_{n,\alpha_0,\beta_n^{\max}}(T_n(\alpha_0) \le \cdot \mid \hat\beta_n)$
puts positive mass on every non-empty interval. The analogous result holds
for the case where $\hat c_n$ replaces $c_{n,\hat\beta_n}(\delta)$ (and
conditioning is w.r.t. $\hat c_n$), see Lemma 4 in the Appendix for a formal
statement.

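The observation that the test (3) based on the critical value
$c_{n,\hat\beta_n}(\delta)$ typically will not be a level $\delta$ test has
led Loh (1985) and subsequently Berger and Boos (1994) and Silvapulle (1996)
to consider the following procedure (or variants thereof) which leads to a
level $\delta$ test, but is somewhat less "conservative" than the test given
by (2): Let $I_n$ be a random set in $B$ satisfying

$$\inf_{\beta \in B} P_{n,\alpha_0,\beta}(\beta \in I_n) \ge 1 - \eta_n,$$

where $0 \le \eta_n < \delta$. I.e., $I_n$ is a confidence set for the
nuisance parameter $\beta$ with infimal coverage probability not less than
$1 - \eta_n$ (provided $\alpha = \alpha_0$). Define a random critical value
via

$$c_{n,\eta_n,\mathrm{Loh}}(\delta) = \sup_{\beta \in I_n} c_{n,\beta}(\delta - \eta_n). \qquad (6)$$

Then we have

$$\sup_{\beta \in B} P_{n,\alpha_0,\beta}\left(T_n(\alpha_0) > c_{n,\eta_n,\mathrm{Loh}}(\delta)\right) \le \delta.$$

This is seen as follows: For every $\beta \in B$

$$\begin{aligned}
P_{n,\alpha_0,\beta}\left(T_n(\alpha_0) > c_{n,\eta_n,\mathrm{Loh}}(\delta)\right)
&= P_{n,\alpha_0,\beta}\left(T_n(\alpha_0) > c_{n,\eta_n,\mathrm{Loh}}(\delta),\ \beta \in I_n\right)
 + P_{n,\alpha_0,\beta}\left(T_n(\alpha_0) > c_{n,\eta_n,\mathrm{Loh}}(\delta),\ \beta \notin I_n\right) \\
&\le P_{n,\alpha_0,\beta}\left(T_n(\alpha_0) > c_{n,\beta}(\delta - \eta_n),\ \beta \in I_n\right) + \eta_n \\
&\le P_{n,\alpha_0,\beta}\left(T_n(\alpha_0) > c_{n,\beta}(\delta - \eta_n)\right) + \eta_n \\
&= \delta - \eta_n + \eta_n = \delta.
\end{aligned}$$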

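Hence, the critical value $c_{n,\eta_n,\mathrm{Loh}}(\delta)$ results in a
test that is guaranteed to be level $\delta$. In fact, its size can also be
lower bounded by $\delta - \eta_n$ provided there exists a
$\beta_n^{\max}(\delta - \eta_n)$ satisfying
$c_{n,\beta_n^{\max}(\delta-\eta_n)}(\delta - \eta_n) = \sup_{\beta \in B} c_{n,\beta}(\delta - \eta_n)$:
This follows since

$$\begin{aligned}
\sup_{\beta \in B} P_{n,\alpha_0,\beta}\left(T_n(\alpha_0) > c_{n,\eta_n,\mathrm{Loh}}(\delta)\right)
&\ge \sup_{\beta \in B} P_{n,\alpha_0,\beta}\left(T_n(\alpha_0) > \sup_{\beta' \in B} c_{n,\beta'}(\delta - \eta_n)\right) \\
&= \sup_{\beta \in B} P_{n,\alpha_0,\beta}\left(T_n(\alpha_0) > c_{n,\beta_n^{\max}(\delta-\eta_n)}(\delta - \eta_n)\right) \\
&\ge P_{n,\alpha_0,\beta_n^{\max}(\delta-\eta_n)}\left(T_n(\alpha_0) > c_{n,\beta_n^{\max}(\delta-\eta_n)}(\delta - \eta_n)\right) \\
&= \delta - \eta_n. \qquad (7)
\end{aligned}$$

Apparently unaware of Loh (1985), Berger and Boos (1994), and Silvapulle
(1996), the just given construction of a critical value has also been
suggested by DiTraglia (2011) and McCloskey (2011).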

The test based on the random critical value
$c_{n,\eta_n,\mathrm{Loh}}(\delta)$ may have size strictly smaller than
$\delta$. This suggests that this test will not improve over the conservative
test based on $c_{n,\sup}(\delta)$ for all values of $\beta$: We can expect
that the test based on (6) will sacrifice some power when compared with the
conservative test (2) when the true $\beta$ is close to
$\beta_n^{\max}(\delta)$ or $\beta_n^{\max}(\delta - \eta_n)$; however, we
can often expect a power gain for values of $\beta$ that are "far away" from
$\beta_n^{\max}(\delta)$ and $\beta_n^{\max}(\delta - \eta_n)$, as we then
typically will have that $c_{n,\eta_n,\mathrm{Loh}}(\delta)$ is smaller than
$c_{n,\sup}(\delta)$. Hence, each of the two tests will typically have a power
advantage over the other in certain parts of the parameter space $B$.

In an attempt to get the power advantages of both tests, McCloskey (2011)
suggested (in the context of testing post model selection) to use random
critical values of the form

$$c_{n,\eta_n,\min}(\delta) = \min\left(c_{n,\sup}(\delta),\ c_{n,\eta_n,\mathrm{Loh}}(\delta)\right). \qquad (8)$$

In fact, his proposal, $c_{n,\mathrm{McC}}(\delta)$ say, is even smaller, and
obtained by taking the minimum of critical values of the form (8) when
$\eta_n$ runs through a finite set of values. However, by construction the
critical value (8) satisfies

$$c_{n,\eta_n,\min}(\delta) \le c_{n,\sup}(\delta),$$

and hence can be expected to fall under the wrath of Proposition 1 given
above. Thus it can be expected not to deliver a test that has level $\delta$,
but has a size that exceeds $\delta$. [In fact, McCloskey's (2011) suggestion
$c_{n,\mathrm{McC}}(\delta)$ being even less than or equal to
$c_{n,\eta_n,\min}(\delta)$ exacerbates the problem.] So while McCloskey's
proposal will reject more often than the tests based on (2) or on (6), it
does so by violating the size constraint. Hence it suffers from the same
problems as the parametric bootstrap test (3). We make the trivial
observation that the lower bound (7) also holds if
$c_{n,\eta_n,\min}(\delta)$ instead of $c_{n,\eta_n,\mathrm{Loh}}(\delta)$ is
used (since
$c_{n,\eta_n,\min}(\delta) \le c_{n,\eta_n,\mathrm{Loh}}(\delta)$ holds).

While the above proposition tells us that the test based on the random
critical values figuring in the proposition, like
$c_{n,\hat\beta_n}(\delta)$ or $c_{n,\eta_n,\min}(\delta)$, will typically
not have level $\delta$, it leaves open the possibility that the overshoot of
the size over $\delta$ may converge to zero as sample size goes to infinity
and hence the test would be at least asymptotically of level $\delta$. In
sufficiently "regular" testing problems this will indeed be the case.
However, we next provide an example where the overshoot does not converge to
zero for the tests based on $c_{n,\hat\beta_n}(\delta)$ or
$c_{n,\eta_n,\min}(\delta)$, and hence these tests are not level $\delta$
even asymptotically.

2 An Illustrative Example 

In the following we shall - for the sake of exposition - use a very simple
example to illustrate the issues involved. Consider the linear regression
model

$$y_t = \alpha x_{t1} + \beta x_{t2} + \epsilon_t \qquad (1 \le t \le n) \qquad (9)$$

under the "textbook" assumptions that the errors $\epsilon_t$ are i.i.d.
$N(0,\sigma^2)$, $\sigma^2 > 0$, the nonstochastic $n \times 2$ regressor
matrix $X$ has full rank (implying $n > 1$), and satisfies
$X'X/n \to Q > 0$ as $n \to \infty$. The variables $y_t$, $x_{ti}$, as well
as the errors $\epsilon_t$ can be allowed to depend on sample size $n$ (in
fact may be defined on a sample space that itself depends on $n$), but we do
not show this in the notation. For simplicity, we shall also assume that the
error variance $\sigma^2$ is known and equals 1. It will be convenient to
write the matrix $(X'X/n)^{-1}$ as

$$(X'X/n)^{-1} = \begin{pmatrix} \sigma_{\alpha,n}^2 & \sigma_{\alpha,\beta,n} \\ \sigma_{\alpha,\beta,n} & \sigma_{\beta,n}^2 \end{pmatrix}.$$

The elements of the limit of this matrix will be denoted by
$\sigma_{\alpha,\infty}^2$ etc. It will prove useful to define
$\rho_n = \sigma_{\alpha,\beta,n}/(\sigma_{\alpha,n}\sigma_{\beta,n})$, i.e.,
$\rho_n$ is the correlation coefficient between the least-squares estimators
for $\alpha$ and $\beta$ in model (9). Its limit will be denoted by
$\rho_\infty$. Note that $|\rho_\infty| < 1$ holds since $Q > 0$ has been
assumed.

As in Leeb and Pötscher (2005) we shall consider two candidate models from
which we select on the basis of the data: The unrestricted model denoted by
$U$ which uses both regressors $x_{t1}$ and $x_{t2}$, and the restricted
model denoted by $R$ which uses only the regressor $x_{t1}$ (and thus
corresponds to imposing the restriction $\beta = 0$). The least-squares
estimators for $\alpha$ and $\beta$ in the unrestricted model will be denoted
by $\hat\alpha_n(U)$ and $\hat\beta_n(U)$, respectively. The least-squares
estimator for $\alpha$ in the restricted model will be denoted by
$\hat\alpha_n(R)$, and we shall set $\hat\beta_n(R) = 0$. We shall decide
between the competing models $U$ and $R$ depending on whether the
test-statistic $|\sqrt{n}\hat\beta_n(U)/\sigma_{\beta,n}| > c$ or not, where
$c > 0$ is a user-specified cut-off point independent of sample size (in line
with the fact that we consider conservative model selection). That is, we
select the model $\hat M_n$ according to

$$\hat M_n = \begin{cases} U & \text{if } |\sqrt{n}\hat\beta_n(U)/\sigma_{\beta,n}| > c, \\ R & \text{otherwise.} \end{cases}$$



We now want to test the hypothesis $H_0 : \alpha = \alpha_0$ versus
$H_1 : \alpha > \alpha_0$ and we insist, for better or worse, on using as a
test-statistic

$$T_n(\alpha_0) = \left[n^{1/2}(\hat\alpha_n(R) - \alpha_0) / \left(\sigma_{\alpha,n}(1 - \rho_n^2)^{1/2}\right)\right] 1(\hat M_n = R) + \left[n^{1/2}(\hat\alpha_n(U) - \alpha_0) / \sigma_{\alpha,n}\right] 1(\hat M_n = U).$$

That is, depending on which of the two models has been selected, we insist on
using the corresponding textbook test statistic (for the known-variance
case). While this is clearly somewhat simple-minded, it describes how such a
test may be conducted in practice when model selection precedes the inference
step. It is well-known that if one uses this test-statistic and naively
compares it to the usual normal-based quantiles acting as if the selected
model were given a priori, this results in a test with severe
size-distortions, see, e.g., Kabaila and Leeb (2006) and references therein.
Hence, while sticking with $T_n(\alpha_0)$ as the test-statistic, we now look
for appropriate critical values in the spirit of the preceding section and
discuss some of the proposals in the literature. Note that the situation just
described fits into the framework of the preceding section with $\beta$ as
the nuisance parameter and $B = \mathbb{R}$.

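Calculations similar to the ones in Leeb and Pötscher (2005) show that the
finite-sample distribution of $T_n(\alpha_0)$ under $H_0$ has a density that
is given by

$$\begin{aligned}
h_{n,\beta}(u) ={}& \Delta\left(n^{1/2}\beta/\sigma_{\beta,n},\ c\right) \phi\left(u + \rho_n (1 - \rho_n^2)^{-1/2} n^{1/2}\beta/\sigma_{\beta,n}\right) \\
&+ \left(1 - \Delta\left((1 - \rho_n^2)^{-1/2}\left(n^{1/2}\beta/\sigma_{\beta,n} + \rho_n u\right),\ (1 - \rho_n^2)^{-1/2} c\right)\right) \phi(u),
\end{aligned}$$

where $\Delta(a, b) = \Phi(a + b) - \Phi(a - b)$, and where $\phi$ and
$\Phi$ denote the standard normal density and cumulative distribution
function, respectively. Let $H_{n,\beta}$ denote the cumulative distribution
function corresponding to $h_{n,\beta}$.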
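As a rough numerical sanity check (not part of the original argument), the density above can be coded directly. The following minimal Python sketch writes the density in terms of $\gamma = n^{1/2}\beta/\sigma_{\beta,n}$ and a fixed correlation $\rho$; the function names, the integration routine, and the values $\rho = 0.7$, $c = 2$, $\gamma = -1.8$ are illustrative assumptions, not quantities taken from the paper. It checks that the two components add up to a proper density.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal: STD.pdf is phi, STD.cdf is Phi


def delta_fn(a, b):
    """Delta(a, b) = Phi(a + b) - Phi(a - b)."""
    return STD.cdf(a + b) - STD.cdf(a - b)


def density(u, gamma, rho, c):
    """Density of T_n(alpha_0) under H_0, with gamma = sqrt(n)*beta/sigma_{beta,n}."""
    s = sqrt(1.0 - rho * rho)
    # contribution of the event that the restricted model R is selected
    restricted = delta_fn(gamma, c) * STD.pdf(u + rho * gamma / s)
    # contribution of the event that the unrestricted model U is selected
    unrestricted = (1.0 - delta_fn((gamma + rho * u) / s, c / s)) * STD.pdf(u)
    return restricted + unrestricted


def integrate(f, lo, hi, steps):
    """Composite trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h


RHO, C = 0.7, 2.0  # illustrative values only (assumptions)
total_mass = integrate(lambda u: density(u, -1.8, RHO, C), -12.0, 12.0, 4800)
print(round(total_mass, 6))
```

For $\gamma = 0$ the same function is symmetric in $u$, which is the symmetry used in Remark 7.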

Now, for given significance level $\delta$, $0 < \delta < 1$, let
$c_{n,\beta}(\delta) = H_{n,\beta}^{-1}(1 - \delta)$ as in the preceding
section. Note that the inverse function exists, since $H_{n,\beta}$ is
continuous and is strictly increasing as its density $h_{n,\beta}$ is positive
everywhere. As in the preceding section let

$$c_{n,\sup}(\delta) = \sup_{\beta \in \mathbb{R}} c_{n,\beta}(\delta) \qquad (10)$$
denote the conservative critical value (the supremum is actually a maximum in
the interesting case $\delta < 1/2$ in view of Lemmata 5 and 6 in the
Appendix). Let $c_{n,\hat\beta_n(U)}(\delta)$ denote the parametric bootstrap
based random critical value. With $\eta$ satisfying $0 < \eta < \delta$, we
also consider the random critical value

$$c_{n,\eta,\mathrm{Loh}}(\delta) = \sup_{\beta \in I_n} c_{n,\beta}(\delta - \eta) \qquad (11)$$

where

$$I_n = \left[\hat\beta_n(U) \pm n^{-1/2}\sigma_{\beta,n}\Phi^{-1}(1 - (\eta/2))\right]$$

is a $1 - \eta$ confidence interval for $\beta$. [Again the supremum is
actually a maximum.] We choose here $\eta$ independent of $n$ as in McCloskey
(2011) and DiTraglia (2011) and comment on sample size dependent $\eta$
below. Furthermore define

$$c_{n,\eta,\min}(\delta) = \min\left(c_{n,\sup}(\delta),\ c_{n,\eta,\mathrm{Loh}}(\delta)\right). \qquad (12)$$
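To make the three critical values (10)-(12) concrete, the following Python sketch approximates them numerically. It is only an illustration under stated assumptions: it works in terms of $\gamma = n^{1/2}\beta/\sigma_{\beta,n}$ with a fixed $\rho$, it replaces the supremum over $\mathbb{R}$ by a maximum over a grid on $[-6, 6]$, and the helper names (`crit`, `sup_crit`, `c_loh`, `c_min`) as well as the values $\rho = 0.7$, $c = 2$, $\delta = 0.05$, $\eta = 0.01$ are hypothetical choices.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()
RHO, C = 0.7, 2.0            # illustrative correlation and cut-off (assumptions)
DELTA, ETA = 0.05, 0.01      # illustrative level and eta (assumptions)
S = sqrt(1.0 - RHO * RHO)
U_GRID = [-12.0 + 0.02 * i for i in range(1201)]  # grid on which the cdf is built


def density(u, gamma):
    """Density of the test statistic, expressed through gamma."""
    d1 = STD.cdf(gamma + C) - STD.cdf(gamma - C)
    x = (gamma + RHO * u) / S
    d2 = STD.cdf(x + C / S) - STD.cdf(x - C / S)
    return d1 * STD.pdf(u + RHO * gamma / S) + (1.0 - d2) * STD.pdf(u)


def crit(gamma, level):
    """Approximate (1 - level)-quantile: accumulate the cdf by the trapezoidal
    rule and interpolate linearly inside the cell where the target is crossed."""
    target, cum = 1.0 - level, 0.0
    prev = density(U_GRID[0], gamma)
    for k in range(1, len(U_GRID)):
        cur = density(U_GRID[k], gamma)
        width = U_GRID[k] - U_GRID[k - 1]
        step = 0.5 * (prev + cur) * width
        if cum + step >= target:
            return U_GRID[k - 1] + (target - cum) / step * width
        cum, prev = cum + step, cur
    return U_GRID[-1]


def sup_crit(level, lo, hi, step=0.1):
    """Grid approximation to the supremum of crit(gamma, level) over [lo, hi]."""
    n = max(int(round((hi - lo) / step)), 1)
    return max(crit(lo + i * (hi - lo) / n, level) for i in range(n + 1))


c_sup = sup_crit(DELTA, -6.0, 6.0)  # conservative critical value, cf. (10)


def c_loh(gamma_hat):
    """Loh-type critical value, cf. (11): sup over a 1 - ETA confidence interval."""
    half = STD.inv_cdf(1.0 - ETA / 2.0)
    return sup_crit(DELTA - ETA, gamma_hat - half, gamma_hat + half)


def c_min(gamma_hat):
    """Minimum of the two critical values, cf. (12)."""
    return min(c_sup, c_loh(gamma_hat))


print(c_sup, c_min(0.0), c_min(8.0))
```

In this sketch `c_min` never exceeds `c_sup`, and it drops well below `c_sup` when the estimate of $\gamma$ is far from the maximizer, which is exactly the feature that Proposition 1 suggests is problematic.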

In the context of testing post-model-selection (an asymptotic version of) the
critical value given in (10) has been considered in Andrews and Guggenberger
(2009) and the corresponding test is called a "size-corrected
fixed-critical-value test". In the same context, the critical value (11) is
considered in DiTraglia (2011) and McCloskey (2011), and the critical value
(12) is proposed in McCloskey (2011) (more precisely, the critical value
$c_{n,\mathrm{McC}}(\delta)$ defined in McCloskey (2011) is less than or
equal to (12)). In the closely related context of testing
post-model-averaging, Liu (2011) considered the parametric bootstrap-based
critical value, i.e., the analogue of $c_{n,\hat\beta_n(U)}(\delta)$.

While the critical values (10) and (11) lead to tests that are valid level
$\delta$ tests (and have been proposed in the statistics literature much
earlier as discussed in the preceding section), we next show that - as
suggested by the discussion in the preceding section - the proposals by
McCloskey (2011) and Liu (2011) do not lead to tests that have level
$\delta$; furthermore, we not only show that the overshoot of the size of
these tests over $\delta$ is strictly positive, we also show that the
overshoot does not converge to zero as sample size goes to infinity. [In this
preliminary version we show this only for some choices of $\eta$ for
McCloskey's procedure, but a more general result can be established.] This
casts severe doubt on the results in Liu (2011) and McCloskey (2011). For
simplicity the next theorem considers only the case $\rho_n = \rho$, but the
result extends to the more general case where $\rho_n$ may depend on $n$.

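Theorem 2 Suppose $\rho_n = \rho \ne 0$ and let $0 < \delta < 1/2$ be
arbitrary. Then

$$\inf_{n > 1} \sup_{\beta \in \mathbb{R}} P_{n,\alpha_0,\beta}\left(T_n(\alpha_0) > c_{n,\hat\beta_n(U)}(\delta)\right) > \delta. \qquad (13)$$

Furthermore, there exist $0 < \eta < \delta$ such that we have

$$\inf_{n > 1} \sup_{\beta \in \mathbb{R}} P_{n,\alpha_0,\beta}\left(T_n(\alpha_0) > c_{n,\eta,\min}(\delta)\right) > \delta, \qquad (14)$$

and consequently also

$$\inf_{n > 1} \sup_{\beta \in \mathbb{R}} P_{n,\alpha_0,\beta}\left(T_n(\alpha_0) > c_{n,\mathrm{McC}}(\delta)\right) > \delta. \qquad (15)$$

In fact, the suprema in the above displays actually do not depend on $n$.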

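Proof. We first prove (14). Introduce the abbreviation
$\gamma = n^{1/2}\beta/\sigma_{\beta,n}$ and define
$\hat\gamma(U) = n^{1/2}\hat\beta(U)/\sigma_{\beta,n}$. Observe that the
density $h_{n,\beta}$ (and hence the cdf $H_{n,\beta}$) depends on the
nuisance parameter $\beta$ only via $\gamma$, and otherwise is independent of
sample size $n$ (since $\rho_n = \rho$ is assumed). Let $h_\gamma$ be the
density of $T_n(\alpha_0)$ when expressed in the reparameterization $\gamma$.
As a consequence, the quantiles satisfy $c_{n,\beta}(v) = c_\gamma(v)$ for
every $0 < v < 1$, where $c_\gamma(v) = H_\gamma^{-1}(1 - v)$ and $H_\gamma$
denotes the cdf corresponding to $h_\gamma$. Furthermore, for
$0 < \eta < \delta$, observe that
$c_{n,\eta,\mathrm{Loh}}(\delta) = \sup_{\beta \in I_n} c_{n,\beta}(\delta - \eta)$
can be rewritten as

$$c_{n,\eta,\mathrm{Loh}}(\delta) = \sup_{\gamma \in [\hat\gamma(U) \pm \Phi^{-1}(1 - (\eta/2))]} c_\gamma(\delta - \eta).$$

Now define $\gamma^{\max} = \gamma^{\max}(\delta)$ as a value of $\gamma$
such that
$c_{\gamma^{\max}}(\delta) = c_{\sup}(\delta) := \sup_{\gamma \in \mathbb{R}} c_\gamma(\delta)$.
That such a maximizer exists follows from Lemmata 5 and 6 in the Appendix.
Note that $\gamma^{\max}$ does not depend on $n$. Of course, $\gamma^{\max}$
is related to $\beta_n^{\max} = \beta_n^{\max}(\delta)$ via
$\gamma^{\max} = n^{1/2}\beta_n^{\max}/\sigma_{\beta,n}$. Since
$c_{\sup}(\delta) = c_{\gamma^{\max}}(\delta)$ is strictly larger than

$$\lim_{|\gamma| \to \infty} c_\gamma(\delta) = \Phi^{-1}(1 - \delta)$$

in view of Lemmata 5 and 6 in the Appendix, we can find a (sufficiently
small) $\eta$, $0 < \eta < \delta$, such that

$$\lim_{|\gamma| \to \infty} c_\gamma(\delta - \eta) = \Phi^{-1}(1 - (\delta - \eta)) < c_{\sup}(\delta) = c_{\gamma^{\max}}(\delta).$$

Let now $\varepsilon > 0$ satisfy
$\varepsilon < c_{\sup}(\delta) - \Phi^{-1}(1 - (\delta - \eta))$. By
continuity of $c_\gamma(\delta - \eta)$ w.r.t. $\gamma$ and the limit relation
just given, we see that there exists $M = M(\varepsilon) > 0$ such that for
$|\gamma| > M$ we have $c_\gamma(\delta - \eta) < c_{\sup}(\delta) - \varepsilon$.
Define the set

$$A = \left\{x \in \mathbb{R} : |x| > \Phi^{-1}(1 - (\eta/2)) + M\right\}.$$

Then on the event $\{\hat\gamma(U) \in A\}$ we have that
$c_{n,\eta,\min}(\delta) \le c_{\sup}(\delta) - \varepsilon$. Furthermore,
noting that
$P_{n,\alpha_0,\beta_n^{\max}}(T_n(\alpha_0) > c_{n,\sup}(\delta)) = P_{n,\alpha_0,\beta_n^{\max}}(T_n(\alpha_0) > c_{\sup}(\delta)) = \delta$,
we have

$$\begin{aligned}
\sup_{\beta \in \mathbb{R}} P_{n,\alpha_0,\beta}\left(T_n(\alpha_0) > c_{n,\eta,\min}(\delta)\right)
&\ge P_{n,\alpha_0,\beta_n^{\max}}\left(T_n(\alpha_0) > c_{n,\eta,\min}(\delta)\right) \\
&= P_{n,\alpha_0,\beta_n^{\max}}\left(T_n(\alpha_0) > c_{\sup}(\delta)\right)
 + P_{n,\alpha_0,\beta_n^{\max}}\left(c_{n,\eta,\min}(\delta) < T_n(\alpha_0) \le c_{\sup}(\delta)\right) \\
&\ge \delta + P_{n,\alpha_0,\beta_n^{\max}}\left(c_{n,\eta,\min}(\delta) < T_n(\alpha_0) \le c_{\sup}(\delta),\ \hat\gamma(U) \in A\right) \\
&\ge \delta + P_{n,\alpha_0,\beta_n^{\max}}\left(c_{\sup}(\delta) - \varepsilon < T_n(\alpha_0) \le c_{\sup}(\delta),\ \hat\gamma(U) \in A\right).
\end{aligned}$$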

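We are hence done if we can show that the probability in the last line is
positive, observing that this probability clearly is independent of $n$ since
under $P_{n,\alpha_0,\beta_n^{\max}}$ the statistic $T_n(\alpha_0)$ has cdf
$H_{\gamma^{\max}}$. Now a simple (perhaps loose) lower bound is as follows

$$\begin{aligned}
&P_{n,\alpha_0,\beta_n^{\max}}\left(c_{\sup}(\delta) - \varepsilon < T_n(\alpha_0) \le c_{\sup}(\delta),\ \hat\gamma(U) \in A\right) \\
&\quad = P_{n,\alpha_0,\beta_n^{\max}}\left(c_{\sup}(\delta) - \varepsilon < T_n(\alpha_0) \le c_{\sup}(\delta),\ \hat\gamma(U) \in A,\ |\hat\gamma(U)| \le c\right) \\
&\qquad + P_{n,\alpha_0,\beta_n^{\max}}\left(c_{\sup}(\delta) - \varepsilon < T_n(\alpha_0) \le c_{\sup}(\delta),\ \hat\gamma(U) \in A,\ |\hat\gamma(U)| > c\right) \\
&\quad = P_{n,\alpha_0,\beta_n^{\max}}\left(c_{\sup}(\delta) \ge n^{1/2}(\hat\alpha(R) - \alpha_0)/\left(\sigma_{\alpha,n}(1 - \rho^2)^{1/2}\right) > c_{\sup}(\delta) - \varepsilon,\ \hat\gamma(U) \in A,\ |\hat\gamma(U)| \le c\right) \\
&\qquad + P_{n,\alpha_0,\beta_n^{\max}}\left(c_{\sup}(\delta) \ge n^{1/2}(\hat\alpha(U) - \alpha_0)/\sigma_{\alpha,n} > c_{\sup}(\delta) - \varepsilon,\ \hat\gamma(U) \in A,\ |\hat\gamma(U)| > c\right) \\
&\quad = \left[\Phi(c_{\sup}(\delta)) - \Phi(c_{\sup}(\delta) - \varepsilon)\right] \Pr\left(Z_2 \in A,\ |Z_2| \le c\right) \\
&\qquad + \Pr\left(c_{\sup}(\delta) \ge Z_1 > c_{\sup}(\delta) - \varepsilon,\ Z_2 \in A,\ |Z_2| > c\right)
\end{aligned}$$

where we have made use of independence of $\hat\alpha(R)$ and
$\hat\gamma(U)$, cf. Lemma A.1 in Leeb and Pötscher (2003), and where

$$(Z_1, Z_2)' \sim N\left((0, \gamma^{\max})',\ \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}\right)$$

is a non-singular normal distribution since $|\rho| < 1$. Now it is obvious
that the probability in the last line of the above display is strictly
positive and is independent of $n$. This proves (14). Since
$c_{n,\mathrm{McC}}(\delta) \le c_{n,\eta,\min}(\delta)$ the result (15)
follows immediately.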

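We turn to the proof of (13). Observe that
$c_{n,\hat\beta_n(U)}(\delta) = c_{\hat\gamma(U)}(\delta)$ and that

$$c_{\sup}(\delta) = c_{\gamma^{\max}}(\delta) > \lim_{|\gamma| \to \infty} c_\gamma(\delta) = \Phi^{-1}(1 - \delta)$$

in view of Lemmata 5 and 6 in the Appendix. Choose $\varepsilon > 0$ to
satisfy $\varepsilon < c_{\sup}(\delta) - \Phi^{-1}(1 - \delta)$. By
continuity of $c_\gamma(\delta)$ w.r.t. $\gamma$ we see that there exists
$M = M(\varepsilon) > 0$ such that for $|\gamma| > M$ we have
$c_\gamma(\delta) < c_{\sup}(\delta) - \varepsilon$. Define the set

$$B = \{x \in \mathbb{R} : |x| > M\}.$$

Then on the event $\{\hat\gamma(U) \in B\}$ we have that
$c_{n,\hat\beta_n(U)}(\delta) = c_{\hat\gamma(U)}(\delta) < c_{\sup}(\delta) - \varepsilon$.
The rest of the proof is then completely analogous to the proof of (14) with
the set $A$ replaced by $B$. ■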
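The overshoot asserted in (13) can be probed by simulation. The sketch below is a hedged Monte Carlo illustration, not the authors' computation: it tabulates $c_\gamma(\delta)$ on a grid, draws the test statistic via the representation in terms of two independent standard normals $W$ and $Z$ given in the proof of Lemma 6 in the Appendix, and estimates the rejection probability of the parametric bootstrap test at one fixed value of $\gamma$ (which is a lower bound for the size). The values $\rho = 0.7$, $c = 2$, $\delta = 0.05$, $\gamma = -1.8$, the grids, and all function names are illustrative assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist

STD = NormalDist()
RHO, C, DELTA = 0.7, 2.0, 0.05   # illustrative values (assumptions)
S = sqrt(1.0 - RHO * RHO)
U_GRID = [-12.0 + 0.02 * i for i in range(1201)]


def density(u, gamma):
    d1 = STD.cdf(gamma + C) - STD.cdf(gamma - C)
    x = (gamma + RHO * u) / S
    d2 = STD.cdf(x + C / S) - STD.cdf(x - C / S)
    return d1 * STD.pdf(u + RHO * gamma / S) + (1.0 - d2) * STD.pdf(u)


def crit(gamma, level):
    """Approximate (1 - level)-quantile via a trapezoidal cdf on U_GRID."""
    target, cum = 1.0 - level, 0.0
    prev = density(U_GRID[0], gamma)
    for k in range(1, len(U_GRID)):
        cur = density(U_GRID[k], gamma)
        width = U_GRID[k] - U_GRID[k - 1]
        step = 0.5 * (prev + cur) * width
        if cum + step >= target:
            return U_GRID[k - 1] + (target - cum) / step * width
        cum, prev = cum + step, cur
    return U_GRID[-1]


GAMMAS = [-8.0 + 0.1 * i for i in range(161)]
TABLE = [crit(g, DELTA) for g in GAMMAS]  # tabulated bootstrap critical values


def boot_crit(gamma_hat):
    """Linear interpolation in the table; clamped outside the tabulated range."""
    if gamma_hat <= GAMMAS[0]:
        return TABLE[0]
    if gamma_hat >= GAMMAS[-1]:
        return TABLE[-1]
    pos = (gamma_hat - GAMMAS[0]) / 0.1
    k = min(int(pos), len(GAMMAS) - 2)
    w = pos - k
    return (1.0 - w) * TABLE[k] + w * TABLE[k + 1]


def rejection_rate(gamma, draws, seed=0):
    """Monte Carlo rejection frequency of the bootstrap test at a given gamma."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(draws):
        w, z = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        gamma_hat = z + gamma
        if abs(gamma_hat) > C:          # unrestricted model selected
            t = S * w + RHO * z
        else:                           # restricted model selected
            t = w - RHO * gamma / S
        hits += t > boot_crit(gamma_hat)
    return hits / draws


rate = rejection_rate(gamma=-1.8, draws=20000)
print(rate, "versus nominal level", DELTA)
```

Because the rejection probability at any single $\gamma$ bounds the size from below, an estimate clearly above $\delta$ at one such point is consistent with (13); the sketch does not attempt to locate the worst-case $\gamma$.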

Remark 3 If we allow $\eta$ to depend on $n$, we may choose
$\eta = \eta_n \to 0$ as $n \to \infty$. Then the test based on
$c_{n,\eta_n,\min}(\delta)$ still has a positive overshoot for every sample
size, but the overshoot will go to zero as $n \to \infty$. But the test then
"approaches" the conservative test that uses $c_{\sup}(\delta)$, and does not
respect the level for any finite sample size. Contrast this with the test
based on $c_{n,\eta_n,\mathrm{Loh}}(\delta)$ which holds the level at each
sample size, and also "approaches" the conservative test if $\eta_n \to 0$.
Hence, there seems to be little reason for preferring
$c_{n,\eta_n,\min}(\delta)$ to $c_{n,\eta_n,\mathrm{Loh}}(\delta)$ in this
scenario.
A Appendix

Lemma 4 Suppose a random variable $\hat c_n$ satisfies
$\Pr(\hat c_n \le c^*) = 1$ for some real number $c^*$ as well as
$\Pr(\hat c_n < c^*) > 0$. Let $S$ be a real-valued random variable. If for
every non-empty interval $J$ in the real line

$$\Pr(S \in J \mid \hat c_n) > 0 \qquad (16)$$

holds almost surely, then

$$\Pr(\hat c_n < S \le c^*) > 0.$$

The same conclusion holds if in (16) the conditioning variable $\hat c_n$ is
replaced by some variable $w_n$, say, provided that $\hat c_n$ is a
measurable function of $w_n$.





Proof. Clearly

$$\Pr(\hat c_n < S \le c^*) = E\left[\Pr\left(S \in (\hat c_n, c^*] \mid \hat c_n\right)\right] = E\left[\Pr\left(S \in (\hat c_n, c^*] \mid \hat c_n\right) 1(\hat c_n < c^*)\right],$$

the last equality being true since the first term in the product is zero on
the event $\hat c_n = c^*$. Now note that, on the event
$\{\hat c_n < c^*\}$, the first factor in the expectation on the far
right-hand side of the above equality is positive almost surely by (16), and
that the event $\{\hat c_n < c^*\}$ has positive probability by assumption. ■

Lemma 5 Assume $\rho_n = \rho \ne 0$. Suppose $0 < v < 1$. Then the map
$\gamma \mapsto c_\gamma(v)$ is continuous on $\mathbb{R}$. Furthermore,
$\lim_{\gamma \to \infty} c_\gamma(v) = \lim_{\gamma \to -\infty} c_\gamma(v) = \Phi^{-1}(1 - v)$.

Proof. If $\gamma_i \to \gamma$ then $h_{\gamma_i}$ converges to $h_\gamma$
pointwise on $\mathbb{R}$. By Scheffé's Lemma, $H_{\gamma_i}$ then converges
to $H_\gamma$ in total variation distance. Since $H_\gamma$ is strictly
increasing on $\mathbb{R}$, convergence of the quantiles $c_{\gamma_i}(v)$ to
$c_\gamma(v)$ follows. The second claim follows by the same argument
observing that $h_\gamma$ converges pointwise to a standard normal density
for $\gamma \to \pm\infty$. ■

Lemma 6 Assume $\rho_n = \rho \ne 0$.

(i) Suppose $0 < v < 1/2$. Then for some $\gamma \in \mathbb{R}$ we have that
$c_\gamma(v)$ is larger than $\Phi^{-1}(1 - v)$.

(ii) Suppose $1/2 < v < 1$. Then for some $\gamma \in \mathbb{R}$ we have
that $c_\gamma(v)$ is smaller than $\Phi^{-1}(1 - v)$.

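Proof. Standard regression theory gives

$$\hat\alpha_n(U) = \hat\alpha_n(R) + \rho\,\sigma_{\alpha,n}\hat\beta_n(U)/\sigma_{\beta,n}$$

with $\hat\alpha_n(R)$ and $\hat\beta_n(U)$ being independent; for the latter
cf., e.g., Leeb and Pötscher (2003), Lemma A.1. Consequently, it is easy to
see that the distribution of $T_n(\alpha_0)$ under $P_{n,\alpha_0,\beta}$ is
the same as the distribution of

$$T' = T'(\rho, \gamma) = \left(\sqrt{1 - \rho^2}\,W + \rho Z\right) 1\{|Z + \gamma| > c\} + \left(W - \rho\gamma/\sqrt{1 - \rho^2}\right) 1\{|Z + \gamma| \le c\},$$

where as before $\gamma = n^{1/2}\beta/\sigma_{\beta,n}$, and where $W$ and
$Z$ are independent standard normal random variables.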
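The representation of $T'$ lends itself to direct simulation. The following Python sketch (an illustration only; the function names and the parameter values $\rho = 0.7$, $\gamma = -1.8$, $c = 2$, $\delta = 0.05$ are assumptions) draws from $T'(\rho,\gamma)$ and estimates how often the naive test, which compares the statistic to the standard normal quantile $\Phi^{-1}(1-\delta)$, rejects under the null hypothesis.

```python
import random
from math import sqrt
from statistics import NormalDist


def draw_statistic(rho, gamma, c, rng):
    """One draw of T'(rho, gamma) built from independent standard normals W, Z."""
    w, z = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    s = sqrt(1.0 - rho * rho)
    if abs(z + gamma) > c:      # unrestricted model selected
        return s * w + rho * z
    return w - rho * gamma / s  # restricted model selected


def naive_rejection_rate(rho, gamma, c, delta, draws, seed=0):
    """Frequency with which T' exceeds the standard normal (1 - delta)-quantile."""
    rng = random.Random(seed)
    q = NormalDist().inv_cdf(1.0 - delta)
    hits = sum(draw_statistic(rho, gamma, c, rng) > q for _ in range(draws))
    return hits / draws


distorted = naive_rejection_rate(0.7, -1.8, 2.0, 0.05, 20000)
pivotal = naive_rejection_rate(0.0, -1.8, 2.0, 0.05, 20000)
print(distorted, pivotal)
```

With $\rho = 0$ the statistic reduces to $W$ in both branches, so the naive test is exact in that case (cf. Remark 8), whereas for $\rho \ne 0$ the rejection frequency can be far from $\delta$.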

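We now prove (i): Let $q$ be shorthand for $\Phi^{-1}(1 - v)$ and note that
$q > 0$ holds by the assumption on $v$. It suffices to show that
$\Pr(T' \le q) < \Phi(q)$ for some $\gamma$. Now we can write

$$\begin{aligned}
\Pr(T' \le q) ={}& \Pr\left(\sqrt{1 - \rho^2}\,W + \rho Z \le q\right)
 - \Pr\left(|Z + \gamma| \le c,\ W \le \frac{q - \rho Z}{\sqrt{1 - \rho^2}}\right) \\
&+ \Pr\left(|Z + \gamma| \le c,\ W \le q + \frac{\rho\gamma}{\sqrt{1 - \rho^2}}\right) \\
={}& \Phi(q) - \Pr(A) + \Pr(B).
\end{aligned}$$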
Here, $A$ and $B$ are the events given in terms of $W$ and $Z$. Picturing
these two events as subsets of the plane (with the horizontal axis
corresponding to $Z$ and the vertical axis corresponding to $W$), we see that
$A$ corresponds to the vertical band where $|Z + \gamma| \le c$, truncated
above the line where $W = (q - \rho Z)/\sqrt{1 - \rho^2}$; similarly, $B$
corresponds to the same vertical band $|Z + \gamma| \le c$, truncated now
above the horizontal line where $W = q + \rho\gamma/\sqrt{1 - \rho^2}$.

We only consider the case where $\rho > 0$ in the following, because the case
where $\rho < 0$ is treated similarly, mutatis mutandis. We distinguish two
cases:

Case 1: $\rho c < \left(1 - \sqrt{1 - \rho^2}\right) q$.

In this case the set $B$ is contained in $A$ for every value of $\gamma$,
with $A \setminus B$ being a set of positive Lebesgue measure. Consequently,
$\Pr(A) > \Pr(B)$ holds for every $\gamma$, proving the claim.

Case 2: $\rho c \ge \left(1 - \sqrt{1 - \rho^2}\right) q$.

In this case choose $\gamma$ so that $-\gamma - c > 0$, and, in addition,
such that also $(q - \rho(-\gamma - c))/\sqrt{1 - \rho^2} < 0$, which is
clearly possible. Recalling that $\rho > 0$, note that the point where the
line $W = (q - \rho Z)/\sqrt{1 - \rho^2}$ intersects the horizontal line
$W = q + \rho\gamma/\sqrt{1 - \rho^2}$ has as its first coordinate
$Z = -\gamma + (q/\rho)(1 - \sqrt{1 - \rho^2})$, implying that the
intersection occurs in the right half of the band where
$|Z + \gamma| \le c$. As a consequence, $\Pr(B) - \Pr(A)$ can be written as
follows:

$$\Pr(B) - \Pr(A) = \Pr(B \setminus A) - \Pr(A \setminus B)$$

where

$$B \setminus A = \left\{-\gamma + (q/\rho)(1 - \sqrt{1 - \rho^2}) < Z \le -\gamma + c,\ (q - \rho Z)/\sqrt{1 - \rho^2} < W \le q + \rho\gamma/\sqrt{1 - \rho^2}\right\}$$

and

$$A \setminus B = \left\{-\gamma - c \le Z < -\gamma + (q/\rho)(1 - \sqrt{1 - \rho^2}),\ q + \rho\gamma/\sqrt{1 - \rho^2} < W \le (q - \rho Z)/\sqrt{1 - \rho^2}\right\}.$$

Picturing $A \setminus B$ and $B \setminus A$ as subsets of the plane as in
the preceding paragraph, we see that these events correspond to two
triangles, where the triangle corresponding to $A \setminus B$ is larger than
or equal (in Lebesgue measure) to that corresponding to $B \setminus A$.
Since $\gamma$ was chosen to satisfy $-\gamma - c > 0$ and
$(q - \rho(-\gamma - c))/\sqrt{1 - \rho^2} < 0$, we see that each point in
the triangle corresponding to $A \setminus B$ is closer to the origin than
any point in the triangle corresponding to $B \setminus A$. Because the joint
Lebesgue density of $(Z, W)$, i.e., the bivariate standard Gaussian density,
is spherically symmetric and decreasing in the distance from the origin, it
follows that $\Pr(B \setminus A) - \Pr(A \setminus B) < 0$, as required.

Part (ii) follows since $T'(\rho, \gamma)$ has the same distribution as
$-T'(\rho, -\gamma)$ (replace $(W, Z)$ by $(-W, -Z)$). ■

Remark 7 If $\rho_n = \rho \ne 0$ and $v = 1/2$, then
$c_0(1/2) = \Phi^{-1}(1/2) = 0$ since $h_0$ is symmetric about zero.

Remark 8 If $\rho_n = \rho = 0$ then $T_n(\alpha_0)$ is standard normally
distributed for every value of $\beta$, and hence
$c_\gamma(v) = \Phi^{-1}(1 - v)$ holds for every $\gamma$ and $v$.